
DGL & RDKit | Visualizing Atom Weights of a Trained Model with Attentive FP

Author: 手机用户2502892403 | Source: Internet | 2023-09-06 08:34


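DGL provides many functions for cheminformatics, pharmaceutical, and bioinformatics tasks.

The DGL developers provide code for visualizing the atom weights of a trained model. After building a model with Attentive FP, you can visualize the atom weights of a given molecule, that is, how much each atom contributes to the target value.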

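Visualizing Atom Weights of a Trained Model with Attentive FP

Environment setup

PyTorch: deep learning framework
DGL: a library built on PyTorch that supports deep learning on graphs
RDKit: used to build molecular graphs and draw structures from string representations
MDTraj: an open-source library for analyzing molecular dynamics trajectories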
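Before running the notebook it can help to confirm which of these packages are installed. The snippet below is a minimal sketch using only the standard library; the helper name `installed_versions` and the distribution names passed to it are assumptions (for example, RDKit may be registered under a different name depending on how it was installed).

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its version, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names are assumptions; adjust them to match your installer.
for name, version in installed_versions(["torch", "dgl", "rdkit", "mdtraj"]).items():
    print(f"{name}: {version or 'not installed'}")
```

A `None` entry only means that no distribution is registered under that name; the module may still be importable if it was installed another way.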

Importing the libraries

%matplotlib inline
import matplotlib.pyplot as plt
import os
from rdkit import Chem
from rdkit import RDPaths

import dgl
import numpy as np
import random
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torch.utils.data import Dataset
from dgl import model_zoo

from dgl.data.chem.utils import mol_to_complete_graph, mol_to_bigraph

from dgl.data.chem.utils import atom_type_one_hot
from dgl.data.chem.utils import atom_degree_one_hot
from dgl.data.chem.utils import atom_formal_charge
from dgl.data.chem.utils import atom_num_radical_electrons
from dgl.data.chem.utils import atom_hybridization_one_hot
from dgl.data.chem.utils import atom_total_num_H_one_hot
from dgl.data.chem.utils import one_hot_encoding
from dgl.data.chem import CanonicalAtomFeaturizer
from dgl.data.chem import CanonicalBondFeaturizer
from dgl.data.chem import ConcatFeaturizer
from dgl.data.chem import BaseAtomFeaturizer
from dgl.data.chem import BaseBondFeaturizer

from dgl.data.chem import one_hot_encoding
from dgl.data.utils import split_dataset

from functools import partial
from sklearn.metrics import roc_auc_score

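The code below is taken from dgl/example. Note that the imports above rely on the dgl.data.chem and dgl.model_zoo modules, which belong to an older DGL release.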


def chirality(atom):
    """Encode the CIP label (R/S) plus whether the atom can be a stereocenter."""
    try:
        return one_hot_encoding(atom.GetProp('_CIPCode'), ['R', 'S']) + \
            [atom.HasProp('_ChiralityPossible')]
    except KeyError:
        # Atoms without an assigned CIP code have no '_CIPCode' property.
        return [False, False] + [atom.HasProp('_ChiralityPossible')]


def collate_molgraphs(data):
    """Batching a list of datapoints for dataloader.

    Parameters
    ----------
    data : list of 3-tuples or 4-tuples.
        Each tuple is for a single datapoint, consisting of
        a SMILES, a DGLGraph, all-task labels and optionally
        a binary mask indicating the existence of labels.

    Returns
    -------
    smiles : list
        List of smiles
    bg : BatchedDGLGraph
        Batched DGLGraphs
    labels : Tensor of dtype float32 and shape (B, T)
        Batched datapoint labels. B is len(data) and
        T is the number of total tasks.
    masks : Tensor of dtype float32 and shape (B, T)
        Batched datapoint binary mask, indicating the
        existence of labels. If binary masks are not
        provided, return a tensor with ones.
    """
    assert len(data[0]) in [3, 4], \
        'Expect the tuple to be of length 3 or 4, got {:d}'.format(len(data[0]))
    if len(data[0]) == 3:
        smiles, graphs, labels = map(list, zip(*data))
        masks = None
    else:
        smiles, graphs, labels, masks = map(list, zip(*data))

    bg = dgl.batch(graphs)
    bg.set_n_initializer(dgl.init.zero_initializer)
    bg.set_e_initializer(dgl.init.zero_initializer)
    labels = torch.stack(labels, dim=0)
    if masks is None:
        masks = torch.ones(labels.shape)
    else:
        masks = torch.stack(masks, dim=0)
    return smiles, bg, labels, masks


atom_featurizer = BaseAtomFeaturizer(
    {'hv': ConcatFeaturizer([
        partial(atom_type_one_hot,
                allowable_set=['B', 'C', 'N', 'O', 'F', 'Si', 'P', 'S', 'Cl',
                               'As', 'Se', 'Br', 'Te', 'I', 'At'],
                encode_unknown=True),
        partial(atom_degree_one_hot, allowable_set=list(range(6))),
        atom_formal_charge,
        atom_num_radical_electrons,
        partial(atom_hybridization_one_hot, encode_unknown=True),
        lambda atom: [0],  # A placeholder for aromatic information
        atom_total_num_H_one_hot,
        chirality,
    ])})

# Bonds carry a 10-dimensional all-zero placeholder feature.
bond_featurizer = BaseBondFeaturizer({'he': lambda bond: [0 for _ in range(10)]})

# Load the solubility data; the 'SOL' property of each molecule is the regression target.
train_mols = Chem.SDMolSupplier('solubility.train.sdf')
train_smi = [Chem.MolToSmiles(m) for m in train_mols]
train_sol = torch.tensor([float(mol.GetProp('SOL')) for mol in train_mols]).reshape(-1, 1)

test_mols = Chem.SDMolSupplier('solubility.test.sdf')
test_smi = [Chem.MolToSmiles(m) for m in test_mols]
test_sol = torch.tensor([float(mol.GetProp('SOL')) for mol in test_mols]).reshape(-1, 1)

train_graph = [mol_to_bigraph(mol,
                              node_featurizer=atom_featurizer,
                              edge_featurizer=bond_featurizer) for mol in train_mols]
test_graph = [mol_to_bigraph(mol,
                             node_featurizer=atom_featurizer,
                             edge_featurizer=bond_featurizer) for mol in test_mols]


def run_a_train_epoch(n_epochs, epoch, model, data_loader,
                      loss_criterion, optimizer):
    model.train()
    losses = []
    for batch_id, batch_data in enumerate(data_loader):
        smiles, bg, labels, masks = batch_data
        if torch.cuda.is_available():
            bg.to(torch.device('cuda:0'))
            labels = labels.to('cuda:0')
            masks = masks.to('cuda:0')

        prediction = model(bg, bg.ndata['hv'], bg.edata['he'])
        # Element-wise loss, zeroed wherever the mask marks a missing label.
        loss = (loss_criterion(prediction, labels) * (masks != 0).float()).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        losses.append(loss.data.item())

    total_score = np.mean(losses)
    print('epoch {:d}/{:d}, training {:.4f}'.format(epoch + 1, n_epochs, total_score))
    return total_score


model = model_zoo.chem.AttentiveFP(node_feat_size=39,
                                   edge_feat_size=10,
                                   num_layers=2,
                                   num_timesteps=2,
                                   graph_feat_size=200,
                                   output_size=1,
                                   dropout=0.2)
if torch.cuda.is_available():
    # Keep the model on the same device as the batches moved in run_a_train_epoch.
    model = model.to('cuda:0')

train_loader = DataLoader(dataset=list(zip(train_smi, train_graph, train_sol)),
                          batch_size=128, collate_fn=collate_molgraphs)
test_loader = DataLoader(dataset=list(zip(test_smi, test_graph, test_sol)),
                         batch_size=128, collate_fn=collate_molgraphs)

# reduction='none' keeps the per-element losses so the mask can be applied.
loss_fn = nn.MSELoss(reduction='none')
optimizer = torch.optim.Adam(model.parameters(), lr=10 ** (-2.5), weight_decay=10 ** (-5.0))

n_epochs = 100
epochs = []
scores = []
for e in range(n_epochs):
    score = run_a_train_epoch(n_epochs, e, model, train_loader, loss_fn, optimizer)
    epochs.append(e)
    scores.append(score)
model.eval()
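The values node_feat_size=39 and edge_feat_size=10 passed to AttentiveFP have to match the lengths produced by the featurizers. The sketch below adds up the pieces of the atom featurizer in plain Python. The lengths for hybridization and hydrogen count assume DGL's default allowable sets (five hybridization types, hydrogen counts 0 to 4); the dictionary and helper names are illustrative and not part of DGL.

```python
# Length of each piece concatenated by the atom featurizer above.
# Hybridization and hydrogen-count sizes are assumptions based on
# DGL's default allowable sets.
ATOM_FEATURE_LENGTHS = {
    "atom type one-hot (15 elements + unknown)": 15 + 1,
    "degree one-hot (0-5)": 6,
    "formal charge": 1,
    "radical electrons": 1,
    "hybridization one-hot (5 types + unknown)": 5 + 1,
    "aromatic placeholder": 1,
    "total H one-hot (0-4)": 5,
    "chirality (R, S, possible)": 3,
}


def total_feature_size(lengths):
    """Sum the lengths of all concatenated feature pieces."""
    return sum(lengths.values())


node_feat_size = total_feature_size(ATOM_FEATURE_LENGTHS)
edge_feat_size = len([0 for _ in range(10)])  # the all-zero bond placeholder

print(node_feat_size, edge_feat_size)
```

If the allowable sets are changed, for instance by adding elements to the atom type list, node_feat_size has to be updated to match.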

Importing the libraries for molecule visualization

import copy
from rdkit.Chem import rdDepictor
from rdkit.Chem.Draw import rdMolDraw2D
from IPython.display import SVG
from IPython.display import display
import matplotlib
import matplotlib.cm as cm

Defining the visualization function

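The code is taken from the DGL library.
The DGL model accepts a get_node_weight option that makes it return the node weights of the graph along with the prediction. The model above was built with num_timesteps=2 (two GRU-based readout steps), so the timestep index must be 0 or 1; the code below uses 0.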
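The index check described here can be sketched without DGL. The example assumes the weights come back as one sequence per timestep; `select_timestep` and the toy values are hypothetical and only illustrate the bounds check.

```python
def select_timestep(weights_per_timestep, timestep):
    """Return the weights of one readout timestep after validating the index."""
    if not 0 <= timestep < len(weights_per_timestep):
        raise IndexError(
            "timestep must be in [0, {}), got {}".format(
                len(weights_per_timestep), timestep))
    return weights_per_timestep[timestep]


# Toy stand-in for the model output: two timesteps, three atoms each.
# The numbers are invented for illustration.
toy_weights = [[0.2, 0.5, 0.3], [0.1, 0.7, 0.2]]
print(select_timestep(toy_weights, 0))
```

With two timesteps, indices 0 and 1 are accepted and anything else raises an IndexError.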

def drawmol(idx, dataset, timestep):
    smiles, graph, _ = dataset[idx]
    print(smiles)
    bg = dgl.batch([graph])
    atom_feats, bond_feats = bg.ndata['hv'], bg.edata['he']
    if torch.cuda.is_available():
        print('use cuda')
        bg.to(torch.device('cuda:0'))
        atom_feats = atom_feats.to('cuda:0')
        bond_feats = bond_feats.to('cuda:0')

    # Ask the model for the per-atom weights along with the prediction.
    _, atom_weights = model(bg, atom_feats, bond_feats, get_node_weight=True)
    assert timestep  # ... (the listing is cut off here in the source)
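The drawmol listing breaks off at the assert, so the remaining steps are not shown. One plausible way to turn per-atom weights into highlight colors is to min-max normalize them and map the result onto a blue-white-red scale. The sketch below does this in plain Python; it is an illustration under that assumption, not the original code, and the helper names are hypothetical.

```python
def min_max_normalize(values):
    """Rescale values linearly to [0, 1]; constant input maps to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def blue_white_red(x):
    """Map x in [0, 1] to an (r, g, b) tuple: 0 is blue, 0.5 white, 1 red."""
    x = min(max(x, 0.0), 1.0)
    if x < 0.5:
        t = x / 0.5
        return (t, t, 1.0)
    t = (x - 0.5) / 0.5
    return (1.0, 1.0 - t, 1.0 - t)


# Invented weights for a three-atom molecule.
toy_weights = [0.25, 0.5, 0.75]
normalized = min_max_normalize(toy_weights)
atom_colors = {i: blue_white_red(w) for i, w in enumerate(normalized)}
print(atom_colors)
```

A dictionary of this shape, atom index to RGB tuple, is the form RDKit's DrawMolecule accepts for its highlightAtomColors argument, so it could be handed to an rdMolDraw2D drawer to produce the SVG.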

Drawing the test-set molecules

The model predicts solubility. In the drawings, red marks atoms with a positive effect on solubility and blue marks atoms with a negative effect.
target = test_loader.dataset
for i in range(len(target)):
    mol, aw, svg = drawmol(i, target, 0)
    display(SVG(svg))
...
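display(SVG(svg)) only works inside a notebook. When running as a script, the SVG strings can be written to files instead. The sketch below uses only the standard library; `save_svgs`, the file naming scheme, and the placeholder SVG string are assumptions for illustration.

```python
from pathlib import Path


def save_svgs(svgs, out_dir):
    """Write each SVG string to out_dir/mol_0000.svg, mol_0001.svg, ..."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, svg in enumerate(svgs):
        path = out / "mol_{:04d}.svg".format(i)
        path.write_text(svg, encoding="utf-8")
        paths.append(path)
    return paths


# Placeholder SVG text standing in for the strings returned by drawmol.
example_svg = '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10"></svg>'
print(save_svgs([example_svg], "atom_weight_svgs"))
```

In the loop above, the svg values returned by drawmol could be collected in a list and passed to this helper once the loop finishes.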
 
 

References
 1. https://github.com/dmlc/dgl/tree/master/apps/life_sci
 2. https://github.com/dmlc/dgl/blob/master/python/dgl/model_zoo/chem/attentive_fp.py
 3. https://pubs.acs.org/doi/full/10.1021/acs.jcim.9b00387




    
        